OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Text Extraction, Enhancement and OCR in Digital Video

Internal identifier: 001F33 (Main/Exploration); previous: 001F32; next: 001F34

Authors: Huiping Li [United States]; David Doermann [United States]; Omid Kia [United States]

RBID : ISTEX:EB8C6F62EFB36A70FC1A1334A7AD19CDE6B2FEB3

Abstract

In this paper we address the problem of text extraction, enhancement and recognition in digital video. Compared with optical character recognition (OCR) from document images, text extraction and recognition in digital video presents several new challenges. First, the text in video is often embedded in complex backgrounds, making text extraction and separation difficult. Second, image data contained in video frames is often digitized and/or subsampled at a much lower resolution than is typical for document images. As a result, most commercial OCR software cannot recognize text extracted from video. We have implemented a hybrid wavelet/neural network segmenter to extract text regions and use a two-stage enhancement scheme prior to recognition. First, we use Shannon interpolation to raise the image resolution, and second we postprocess the block with normal/inverse text classification and adaptive thresholding. Experimental results show that our text extraction scheme can extract both scene text and graphical text robustly, and reasonable OCR results are achieved after enhancement.
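
The two enhancement steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the function names, the truncated sinc kernel, the integer upsampling factor, and the local-mean threshold rule (with its `radius` and `offset` parameters) are all assumptions chosen for illustration. Normal/inverse text classification is not sketched.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def shannon_upsample_1d(samples, factor):
    """Upsample a 1-D sequence by an integer factor.

    Each output value is a sinc-weighted sum of all input samples
    (Whittaker-Shannon interpolation truncated to the finite input).
    """
    out = []
    for j in range(len(samples) * factor):
        t = j / factor  # position in units of the original sample spacing
        out.append(sum(s * sinc(t - k) for k, s in enumerate(samples)))
    return out


def shannon_upsample_2d(image, factor):
    """Apply the 1-D interpolation separably: rows first, then columns."""
    rows = [shannon_upsample_1d(r, factor) for r in image]
    cols = [shannon_upsample_1d(list(c), factor) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def adaptive_threshold(image, radius=1, offset=0.0):
    """Binarize by comparing each pixel with the mean of its neighbourhood.

    Assumed rule: a pixel becomes 1 when it is brighter than the local
    mean minus `offset`, otherwise 0. The window is clipped at borders.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            row.append(1 if image[y][x] > mean - offset else 0)
        out.append(row)
    return out
```

A text block would first be passed through `shannon_upsample_2d` and then through `adaptive_threshold` before being handed to an OCR engine. The pure-Python loops favour readability over speed.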

DOI: 10.1007/3-540-48172-9_29



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Text Extraction, Enhancement and OCR in Digital Video</title>
<author>
<name sortKey="Li, Huiping" sort="Li, Huiping" uniqKey="Li H" first="Huiping" last="Li">Huiping Li</name>
</author>
<author>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<affiliation>
<country>États-Unis</country>
<placeName>
<settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
</author>
<author>
<name sortKey="Kia, Omid" sort="Kia, Omid" uniqKey="Kia O" first="Omid" last="Kia">Omid Kia</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:EB8C6F62EFB36A70FC1A1334A7AD19CDE6B2FEB3</idno>
<date when="1999" year="1999">1999</date>
<idno type="doi">10.1007/3-540-48172-9_29</idno>
<idno type="url">https://api.istex.fr/document/EB8C6F62EFB36A70FC1A1334A7AD19CDE6B2FEB3/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000011</idno>
<idno type="wicri:Area/Istex/Curation">000011</idno>
<idno type="wicri:Area/Istex/Checkpoint">001484</idno>
<idno type="wicri:doubleKey">0302-9743:1999:Li H:text:extraction:enhancement</idno>
<idno type="wicri:Area/Main/Merge">002042</idno>
<idno type="wicri:Area/Main/Curation">001F33</idno>
<idno type="wicri:Area/Main/Exploration">001F33</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Text Extraction, Enhancement and OCR in Digital Video</title>
<author>
<name sortKey="Li, Huiping" sort="Li, Huiping" uniqKey="Li H" first="Huiping" last="Li">Huiping Li</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Maryland</region>
</placeName>
<wicri:cityArea>Language and Media Processing Laboratory Institute for Advanced Computer Studies, University of Maryland College Park, 20742-3275</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Maryland</region>
</placeName>
<wicri:cityArea>Language and Media Processing Laboratory Institute for Advanced Computer Studies, University of Maryland College Park, 20742-3275</wicri:cityArea>
<placeName>
<settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
<placeName>
<settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
</author>
<author>
<name sortKey="Kia, Omid" sort="Kia, Omid" uniqKey="Kia O" first="Omid" last="Kia">Omid Kia</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Maryland</region>
</placeName>
<wicri:cityArea>Advanced Network Technologies Division, National Institute of Standards and Technology Gaithersburg</wicri:cityArea>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: omid.kia@nist.gov</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>1999</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">EB8C6F62EFB36A70FC1A1334A7AD19CDE6B2FEB3</idno>
<idno type="DOI">10.1007/3-540-48172-9_29</idno>
<idno type="ChapterID">29</idno>
<idno type="ChapterID">Chap29</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: In this paper we address the problem of text extraction, enhancement and recognition in digital video. Compared with optical character recognition (OCR) from document images, text extraction and recognition in digital video presents several new challenges. First, the text in video is often embedded in complex backgrounds, making text extraction and separation difficult. Second, image data contained in video frames is often digitized and/or subsampled at a much lower resolution than is typical for document images. As a result, most commercial OCR software can not recognize text extracted from video. We have implemented a hybrid wavelet/neural network segmenter to extract text regions and use a two stage enhancement scheme prior to recognition. First, we use Shannon interpolation to raise the image resolution, and second we postprocess the block with normal/inverse text classification and adaptive thresholding. Experimental results show that our text extraction scheme can extract both scene text and graphical text robustly and reasonable OCR results are achieved after enhancement.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Maryland</li>
</region>
<settlement>
<li>College Park (Maryland)</li>
</settlement>
<orgName>
<li>Université du Maryland</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Maryland">
<name sortKey="Li, Huiping" sort="Li, Huiping" uniqKey="Li H" first="Huiping" last="Li">Huiping Li</name>
</region>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<name sortKey="Kia, Omid" sort="Kia, Omid" uniqKey="Kia O" first="Omid" last="Kia">Omid Kia</name>
<name sortKey="Li, Huiping" sort="Li, Huiping" uniqKey="Li H" first="Huiping" last="Li">Huiping Li</name>
</country>
</tree>
</affiliations>
</record>
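
As an alternative to the Dilib tools, a record of this shape could be read with Python's standard library. The following is a minimal sketch under stated assumptions: the record uses the `wicri:` prefix without declaring a namespace, so a placeholder URI (`urn:example:wicri`, hypothetical) is injected to make it parseable. The helper names and the shortened `SAMPLE` excerpt are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical URI: the record never declares the "wicri:" prefix, so a
# placeholder binding is injected purely to satisfy the XML parser.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Parse a <record> element after binding the wicri: prefix."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def analytic_authors(record):
    """Return the author names under biblStruct/analytic, in document order."""
    return [n.text for n in record.findall(".//analytic/author/name")]


def identifiers(record):
    """Map each <idno type="..."> in publicationStmt to its text."""
    return {
        i.get("type"): i.text
        for i in record.findall(".//publicationStmt/idno")
    }


# Shortened excerpt of the record above, kept only for demonstration.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader><fileDesc>
<publicationStmt>
<idno type="RBID">ISTEX:EB8C6F62EFB36A70FC1A1334A7AD19CDE6B2FEB3</idno>
<idno type="doi">10.1007/3-540-48172-9_29</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic>
<title level="a" type="main" xml:lang="en">Text Extraction, Enhancement and OCR in Digital Video</title>
<author><name sortKey="Li, Huiping">Huiping Li</name></author>
<author><name sortKey="Doermann, David">David Doermann</name></author>
<author><name sortKey="Kia, Omid">Omid Kia</name></author>
</analytic></biblStruct></sourceDesc>
</fileDesc></teiHeader>
</TEI>
</record>"""

record = parse_record(SAMPLE)
authors = analytic_authors(record)
ids = identifiers(record)
```

Restricting the author query to `analytic` avoids counting the same names again from `titleStmt` and from the `affiliations` tree, where they are repeated.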

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001F33 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001F33 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:EB8C6F62EFB36A70FC1A1334A7AD19CDE6B2FEB3
   |texte=   Text Extraction, Enhancement and OCR in Digital Video
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024